The goal of this file is to arrange gene-centric C. elegans RNA-seq data from the modENCODE project (Gerstein et al., 2010), provided by LaDeana Hillier in Dr. Bob Waterston’s lab at UW.
Genome and GFF files for each species were downloaded from WormBase.org version WS290. Reads were aligned to each genome using STAR (v2.7.6a, --alignIntronMax 30000 --alignMatesGapMax 30000) and the species-specific WS290 GTF file. PCR duplicates were removed using “seldup” (Warner et al., 2019). Read counts were obtained for each gene (CDS regions only, labeled “CDS” in C. briggsae and C. elegans and “coding_exon” in C. remanei, C. japonica, and C. brenneri) using featureCounts (from the subread-2.0.6 software package) with default settings; only uniquely mapping reads were counted. Additionally, read counts were obtained for the CDS regions and for the full transcripts using the featureCounts options -M --fraction, so that multimapping reads were counted fractionally, split equally among all of the locations where they aligned.
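For reference, the alignment and counting steps described above might look roughly like the following for one sample of one species. This is a sketch only: the file names (`star_index/`, `genes.ws290.gtf`, `sample_R1/R2.fastq.gz`) are placeholders, and the seldup duplicate-removal step is indicated but not shown.

```shell
# Align paired-end reads with STAR, capping intron and mate-gap sizes at 30 kb
STAR --runMode alignReads \
     --genomeDir star_index/ \
     --readFilesIn sample_R1.fastq.gz sample_R2.fastq.gz \
     --readFilesCommand zcat \
     --sjdbGTFfile genes.ws290.gtf \
     --alignIntronMax 30000 \
     --alignMatesGapMax 30000 \
     --outSAMtype BAM SortedByCoordinate

# (PCR duplicates would then be removed with seldup; step omitted here.)

# Count uniquely mapping reads per gene over CDS features (default settings)
featureCounts -p -t CDS -g gene_id \
              -a genes.ws290.gtf \
              -o featureCount.CDS.unique_only.txt \
              Aligned.sortedByCoord.out.bam

# Count again, splitting multimapping reads fractionally across their locations
featureCounts -p -t CDS -g gene_id -M --fraction \
              -a genes.ws290.gtf \
              -o featureCount.CDS.multi_fraction.txt \
              Aligned.sortedByCoord.out.bam
```

For the three species whose GFF files label coding regions as “coding_exon”, the `-t CDS` option would be replaced with `-t coding_exon`.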
Read data for each species were imported into R and annotated with information from WormBase ParaSite BioMart. Annotation information includes: UniProtKB number, InterPro terms, GO terms, and general Description information.
Raw reads were quantified as counts per million (CPM) using the edgeR package, then filtered to remove genes with low counts (less than 1 CPM). A list of discarded genes and their expression values across life stages was saved. The remaining gene values were normalized using the trimmed mean of M-values method (TMM; Robinson and Oshlack, 2010) to permit between-sample comparisons. The mean-variance relationship was modeled using a precision weights approach (Law et al., 2014).
This document saves multiple files that are passed to a Shiny web app for downstream browsing and on-demand analysis. Note that these files are saved in an Outputs folder; to make them accessible to a local version of the Shiny browser, they need to be moved to the appropriate subfolders within the App folder: the www subfolder (for .csv files) or the Data subfolder (for R objects). Stable copies are already located within those folders and do not need to be replaced unless the pre-processing steps change.
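For a local install, the move can be scripted along these lines. The folder layout below is an assumption (this document's working directory sitting alongside the App folder); adjust the paths to your own checkout.

```shell
# Sketch: copy saved outputs into a local copy of the Shiny app
# (the ../App path is an assumption, not part of this repository's layout)
cp ./Outputs/*.csv      ../App/www/    # .csv files -> www subfolder
cp ./Outputs/*_vDGEList ../App/Data/   # R objects  -> Data subfolder
```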
Note: Code chunks are collated and echoed at the end of the document in Appendix I.
Generate a digital gene expression list that could be easily shared/loaded for downstream filtering/normalization.
species_list <- tibble(species = c('elegans', 'briggsae', 'brenneri', 'remanei', 'japonica'))
for (x in species_list$species) {
# read in the study design ----
targets <- read_tsv(paste0("./Data/", x, "/", x,"_study_design.txt"),
na = c("", "NA", "na"), show_col_types = F)
# load pre-generated annotation information
load(paste0("./Outputs/",x,"_geneAnnotations"))
# import featureCount output into R ----
if (x %in% c('elegans', 'briggsae')) {
path <- paste0("./Data/", x, "/featureCount.C_", x,".", targets$Biological_ID, ".CDS.unique_only.ws290.txt")
} else {
path <- paste0("./Data/", x, "/featureCount.C_", x,".", targets$Biological_ID, ".coding_exon.unique_only.ws290.txt")
}
featureCountData<- rbindlist(lapply(path, fread), idcol="sample") %>%
mutate(sample = targets$sample[sample])
colnames(featureCountData)<-c('sample','geneID', 'Ce_ortholog', 'stableID', "location", "length", "count")
featureCountData_wider <- featureCountData %>%
dplyr::select(!c(Ce_ortholog, location, length)) %>%
pivot_wider(names_from = sample, values_from = count)
counts <- featureCountData_wider %>%
dplyr::select(-stableID)%>%
column_to_rownames(var = "geneID")
annotations_sub<-dplyr::select(featureCountData_wider, c(geneID, stableID)) %>%
left_join(annotations, by = "geneID")
# generate a DGEList
myDGEList <- DGEList(counts,
samples = targets$sample,
group = targets$group,
genes = annotations_sub)
output.name <- paste0(x, '_DGEList')
save(myDGEList,
file = file.path(output.path,
output.name))
}
The goal of this chunk is to: i) filter and normalize the data; ii) use ggplot2 to visualize the impact of filtering and normalization on the data (see Output section, below); iii) save a variance-stabilized DGEList (_vDGEList); iv) save a matrix of variance-stabilized gene expression data, extracted from the vDGEList (_log2cpm_filtered_norm_voom.csv) - this data is downloadable from within the Browser App.
species_list <- tibble(species = c('elegans', 'briggsae', 'brenneri', 'remanei', 'japonica'))
for (x in species_list$species) {
# load pre-generated DGEList information
load(paste0("./Outputs/",x,"_DGEList"))
# load pre-generated annotation information
load(paste0("./Outputs/",x,"_geneAnnotations"))
# read in the study design ----
targets <- read_tsv(paste0("./Data/", x, "/", x,"_study_design.txt"),
na = c("", "NA", "na"), show_col_types = F)
# Generate life stage IDs
ids <- rep(cbind(targets$group),
times = nrow(myDGEList$counts)) %>%
as_factor()
# calculate and plot log2 counts per million ----
# use the 'cpm' function from EdgeR to get log2 counts per million
# then coerce into a tibble
log2.cpm.df.pivot <-cpm(myDGEList, log=TRUE) %>%
as_tibble(rownames = "geneID") %>%
setNames(nm = c("geneID", targets$sample)) %>%
pivot_longer(cols = -geneID,
names_to = "samples",
values_to = "expression") %>%
add_column(life_stage = ids)
# plot the data
p1 <- ggplot(log2.cpm.df.pivot) +
aes(x=samples, y=expression, fill=life_stage) +
geom_violin(trim = FALSE, show.legend = T, alpha= 0.7) +
stat_summary(fun = "median",
geom = "point",
shape = 20,
size = 2,
color = "black",
show.legend = FALSE) +
labs(y="log2 expression", x = "sample",
title = paste0("C. ", x, ": Log2 Counts per Million (CPM)"),
subtitle="unfiltered, non-normalized",
caption=paste0("produced on ", Sys.time())) +
theme_bw() +
scale_fill_manual(values = paletteer_d("rcartocolor::Prism")) +
coord_flip()
# Filter the data ----
# filter genes/transcripts with low counts
# keep genes with more than 1 CPM (TRUE) in at least n samples.
# Note: the cutoff "n" should match the number of samples
# in the smallest group of comparison (here, n = 1).
keepers <- rowSums(cpm(myDGEList) > 1) >= 1
myDGEList.filtered <- myDGEList[keepers,]
ids.filtered <- rep(cbind(targets$group),
times = nrow(myDGEList.filtered)) %>%
as_factor()
log2.cpm.filtered.df.pivot <- cpm(myDGEList.filtered, log=TRUE) %>%
as_tibble(rownames = "geneID") %>%
setNames(nm = c("geneID", targets$sample)) %>%
pivot_longer(cols = -geneID,
names_to = "samples",
values_to = "expression") %>%
add_column(life_stage = ids.filtered)
p2 <- ggplot(log2.cpm.filtered.df.pivot) +
aes(x=samples, y=expression, fill=life_stage) +
geom_violin(trim = FALSE, show.legend = T, alpha= 0.7) +
stat_summary(fun = "median",
geom = "point",
shape = 20,
size = 2,
color = "black",
show.legend = FALSE) +
labs(y="log2 expression", x = "sample",
title = paste0("C. ", x, ": Log2 Counts per Million (CPM)"),
subtitle="filtered, non-normalized",
caption=paste0("produced on ", Sys.time())) +
theme_bw() +
scale_fill_manual(values = paletteer_d("rcartocolor::Prism")) +
coord_flip()
# Look at the genes excluded by the filtering step ----
# just to check that there aren't any with
# high expression that are in few samples
# Discarded genes
myDGEList.discarded <- myDGEList[!keepers,]
ids.discarded <- rep(cbind(targets$group),
times = nrow(myDGEList.discarded)) %>%
as_factor()
log2.cpm.discarded.df.pivot <- cpm(myDGEList.discarded, log = FALSE) %>% # linear-scale CPM, despite the "log2" name
as_tibble(rownames = "geneID") %>%
setNames(nm = c("geneID", targets$sample)) %>%
pivot_longer(cols = -geneID,
names_to = "samples",
values_to = "expression") %>%
add_column(life_stage = ids.discarded)
# Genes that are above 1 cpm
log2.cpm.discarded.df.pivot %>%
dplyr::filter(expression > 1)
# Generate a matrix of discarded genes and their CPM values ----
discarded.gene.df <- log2.cpm.discarded.df.pivot %>%
pivot_wider(names_from = c(life_stage, samples),
names_sep = "-",
values_from = expression,
id_cols = geneID)%>%
left_join(annotations, by = "geneID")
# Save a matrix of discarded genes and their CPM values ----
# Note: the file name includes the species so that each loop
# iteration writes a separate file rather than overwriting the last.
discarded.gene.df %>%
write.csv(file = file.path(output.path,
paste0(x, "_SsRNAseq_discardedGenes.csv")))
# Plot discarded genes
p.discarded <- ggplot(log2.cpm.discarded.df.pivot) +
aes(x=samples, y=expression, color=life_stage) +
geom_jitter(alpha = 0.3, show.legend = T)+
stat_summary(fun = "median",
geom = "point",
shape = 20,
size = 2,
color = "black",
show.legend = FALSE) +
labs(y="expression", x = "sample",
title = paste0("C. ", x, ": Counts per Million (CPM)"),
subtitle="genes excluded by low count filtering step, non-normalized",
caption=paste0("produced on ", Sys.time())) +
theme_bw() +
scale_color_manual(values = paletteer_d("rcartocolor::Prism")) +
coord_flip()
# Normalize the data using a between samples normalization ----
# Source for TMM sample normalization here:
# https://genomebiology.biomedcentral.com/articles/10.1186/gb-2010-11-3-r25
myDGEList.filtered.norm <- calcNormFactors(myDGEList.filtered, method = "TMM")
log2.cpm.filtered.norm <- cpm(myDGEList.filtered.norm, log=TRUE)
log2.cpm.filtered.norm.df<- cpm(myDGEList.filtered.norm, log=TRUE) %>%
as_tibble(rownames = "geneID") %>%
setNames(nm = c("geneID", targets$sample))
log2.cpm.filtered.norm.df.pivot<-log2.cpm.filtered.norm.df %>%
pivot_longer(cols = -geneID,
names_to = "samples",
values_to = "expression") %>%
add_column(life_stage = ids.filtered)
p3 <- ggplot(log2.cpm.filtered.norm.df.pivot) +
aes(x=samples, y=expression, fill=life_stage) +
geom_violin(trim = FALSE, show.legend = T, alpha = 0.7) +
stat_summary(fun = "median",
geom = "point",
shape = 20,
size = 2,
color = "black",
show.legend = FALSE) +
labs(y="log2 expression", x = "sample",
title = paste0("C. ", x, ": Log2 Counts per Million (CPM)"),
subtitle="filtered, TMM normalized",
caption=paste0("produced on ", Sys.time())) +
theme_bw() +
scale_fill_manual(values = paletteer_d("rcartocolor::Prism")) +
coord_flip()
output.name <- paste0(x, '_FilteringNormalizationGraphs')
save(p1, p2, p3, p.discarded, discarded.gene.df,
file = file.path(output.path,
output.name))
output.name <- paste0(x, '_DGEList_filtered_normalized')
save(myDGEList.filtered.norm,
file = file.path(output.path,
output.name))
# Compute Variance-Stabilized DGEList Object ----
# Set up the design matrix ----
# no intercept/blocking for matrix, comparisons across group
group <- factor(targets$group)
design <- model.matrix(~0 + group)
colnames(design) <- levels(group)
# NOTE: To handle a 'blocking' design or a batch effect, use:
# design <- model.matrix(~block + treatment)
# Model the mean-variance trend ----
# Use the voom function from the limma package to model the mean-variance relationship.
# voom produces a variance-stabilized expression object that includes precision
# weights for each gene to control for heteroscedasticity,
# and transforms the count data to log2-counts per million.
# Outputs: E = normalized expression values on the log2 scale
v.DGEList.filtered.norm <- voom(counts = myDGEList.filtered.norm,
design = design, plot = T)
colnames(v.DGEList.filtered.norm)<-targets$sample
colnames(v.DGEList.filtered.norm$E) <- paste(targets$group,
targets$sample,sep = '-')
# Save matrix of genes and their filtered, normalized, voom-transformed counts ----
# This is the count data that underlies the differential expression analyses in the Shiny app.
# Saving it here so that users of the app can access the input information.
output.name <- paste0(x, '_log2cpm_filtered_norm_voom.csv')
# Save in Shiny app download (www) folder
write.csv(v.DGEList.filtered.norm$E,
file = file.path(www.path,
output.name))
# Save v.DGEList ----
# Save in Shiny app Data folder
output.name <- paste0(x, '_vDGEList')
save(v.DGEList.filtered.norm,
file = file.path(app.path,
output.name))
}
knitr::opts_chunk$set(
message = FALSE,
warning = FALSE,
collapse = TRUE
)
if (!require("BiocManager", quietly = TRUE)) install.packages("BiocManager")
if (!require("tximport", quietly = TRUE)) BiocManager::install("tximport", ask = FALSE)
if (!require("ensembldb", quietly = TRUE)) BiocManager::install("ensembldb", ask = FALSE)
if (!require("biomaRt", quietly = TRUE)) BiocManager::install("biomaRt", ask = FALSE)
if (!require("pacman", quietly = TRUE)) install.packages("pacman")
pacman::p_load("tidyverse","data.table", "magrittr","edgeR","matrixStats","cowplot","ggthemes","gprofiler2","limma","tximport", "ensembldb", "biomaRt", "paletteer", "knitr")
# Check for presence of output folder, generate if it doesn't exist
output.path <- "./Outputs"
if (!dir.exists(output.path)){
dir.create(output.path)
}
app.path <-"../Data"
www.path <-"../www"
species_list <- tibble(species = c('elegans', 'briggsae', 'brenneri', 'remanei', 'japonica'))
for (x in species_list$species) {
# read in the study design ----
targets <- read_tsv(paste0("./Data/", x, "/", x,"_study_design.txt"),
na = c("", "NA", "na"), show_col_types = F)
# load pre-generated annotation information
load(paste0("./Outputs/",x,"_geneAnnotations"))
# import featureCount output into R ----
if (x %in% c('elegans', 'briggsae')) {
path <- paste0("./Data/", x, "/featureCount.C_", x,".", targets$Biological_ID, ".CDS.unique_only.ws290.txt")
} else {
path <- paste0("./Data/", x, "/featureCount.C_", x,".", targets$Biological_ID, ".coding_exon.unique_only.ws290.txt")
}
featureCountData<- rbindlist(lapply(path, fread), idcol="sample") %>%
mutate(sample = targets$sample[sample])
colnames(featureCountData)<-c('sample','geneID', 'Ce_ortholog', 'stableID', "location", "length", "count")
featureCountData_wider <- featureCountData %>%
dplyr::select(!c(Ce_ortholog, location, length)) %>%
pivot_wider(names_from = sample, values_from = count)
counts <- featureCountData_wider %>%
dplyr::select(-stableID)%>%
column_to_rownames(var = "geneID")
annotations_sub<-dplyr::select(featureCountData_wider, c(geneID, stableID)) %>%
left_join(annotations, by = "geneID")
# generate a DGEList
myDGEList <- DGEList(counts,
samples = targets$sample,
group = targets$group,
genes = annotations_sub)
output.name <- paste0(x, '_DGEList')
save(myDGEList,
file = file.path(output.path,
output.name))
}
species_list <- tibble(species = c('elegans', 'briggsae', 'brenneri', 'remanei', 'japonica'))
for (x in species_list$species) {
# load pre-generated DGEList information
load(paste0("./Outputs/",x,"_DGEList"))
# load pre-generated annotation information
load(paste0("./Outputs/",x,"_geneAnnotations"))
# read in the study design ----
targets <- read_tsv(paste0("./Data/", x, "/", x,"_study_design.txt"),
na = c("", "NA", "na"), show_col_types = F)
# Generate life stage IDs
ids <- rep(cbind(targets$group),
times = nrow(myDGEList$counts)) %>%
as_factor()
# calculate and plot log2 counts per million ----
# use the 'cpm' function from EdgeR to get log2 counts per million
# then coerce into a tibble
log2.cpm.df.pivot <-cpm(myDGEList, log=TRUE) %>%
as_tibble(rownames = "geneID") %>%
setNames(nm = c("geneID", targets$sample)) %>%
pivot_longer(cols = -geneID,
names_to = "samples",
values_to = "expression") %>%
add_column(life_stage = ids)
# plot the data
p1 <- ggplot(log2.cpm.df.pivot) +
aes(x=samples, y=expression, fill=life_stage) +
geom_violin(trim = FALSE, show.legend = T, alpha= 0.7) +
stat_summary(fun = "median",
geom = "point",
shape = 20,
size = 2,
color = "black",
show.legend = FALSE) +
labs(y="log2 expression", x = "sample",
title = paste0("C. ", x, ": Log2 Counts per Million (CPM)"),
subtitle="unfiltered, non-normalized",
caption=paste0("produced on ", Sys.time())) +
theme_bw() +
scale_fill_manual(values = paletteer_d("rcartocolor::Prism")) +
coord_flip()
# Filter the data ----
# filter genes/transcripts with low counts
# keep genes with more than 1 CPM (TRUE) in at least n samples.
# Note: the cutoff "n" should match the number of samples
# in the smallest group of comparison (here, n = 1).
keepers <- rowSums(cpm(myDGEList) > 1) >= 1
myDGEList.filtered <- myDGEList[keepers,]
ids.filtered <- rep(cbind(targets$group),
times = nrow(myDGEList.filtered)) %>%
as_factor()
log2.cpm.filtered.df.pivot <- cpm(myDGEList.filtered, log=TRUE) %>%
as_tibble(rownames = "geneID") %>%
setNames(nm = c("geneID", targets$sample)) %>%
pivot_longer(cols = -geneID,
names_to = "samples",
values_to = "expression") %>%
add_column(life_stage = ids.filtered)
p2 <- ggplot(log2.cpm.filtered.df.pivot) +
aes(x=samples, y=expression, fill=life_stage) +
geom_violin(trim = FALSE, show.legend = T, alpha= 0.7) +
stat_summary(fun = "median",
geom = "point",
shape = 20,
size = 2,
color = "black",
show.legend = FALSE) +
labs(y="log2 expression", x = "sample",
title = paste0("C. ", x, ": Log2 Counts per Million (CPM)"),
subtitle="filtered, non-normalized",
caption=paste0("produced on ", Sys.time())) +
theme_bw() +
scale_fill_manual(values = paletteer_d("rcartocolor::Prism")) +
coord_flip()
# Look at the genes excluded by the filtering step ----
# just to check that there aren't any with
# high expression that are in few samples
# Discarded genes
myDGEList.discarded <- myDGEList[!keepers,]
ids.discarded <- rep(cbind(targets$group),
times = nrow(myDGEList.discarded)) %>%
as_factor()
log2.cpm.discarded.df.pivot <- cpm(myDGEList.discarded, log = FALSE) %>% # linear-scale CPM, despite the "log2" name
as_tibble(rownames = "geneID") %>%
setNames(nm = c("geneID", targets$sample)) %>%
pivot_longer(cols = -geneID,
names_to = "samples",
values_to = "expression") %>%
add_column(life_stage = ids.discarded)
# Genes that are above 1 cpm
log2.cpm.discarded.df.pivot %>%
dplyr::filter(expression > 1)
# Generate a matrix of discarded genes and their CPM values ----
discarded.gene.df <- log2.cpm.discarded.df.pivot %>%
pivot_wider(names_from = c(life_stage, samples),
names_sep = "-",
values_from = expression,
id_cols = geneID)%>%
left_join(annotations, by = "geneID")
# Save a matrix of discarded genes and their CPM values ----
# Note: the file name includes the species so that each loop
# iteration writes a separate file rather than overwriting the last.
discarded.gene.df %>%
write.csv(file = file.path(output.path,
paste0(x, "_SsRNAseq_discardedGenes.csv")))
# Plot discarded genes
p.discarded <- ggplot(log2.cpm.discarded.df.pivot) +
aes(x=samples, y=expression, color=life_stage) +
geom_jitter(alpha = 0.3, show.legend = T)+
stat_summary(fun = "median",
geom = "point",
shape = 20,
size = 2,
color = "black",
show.legend = FALSE) +
labs(y="expression", x = "sample",
title = paste0("C. ", x, ": Counts per Million (CPM)"),
subtitle="genes excluded by low count filtering step, non-normalized",
caption=paste0("produced on ", Sys.time())) +
theme_bw() +
scale_color_manual(values = paletteer_d("rcartocolor::Prism")) +
coord_flip()
# Normalize the data using a between samples normalization ----
# Source for TMM sample normalization here:
# https://genomebiology.biomedcentral.com/articles/10.1186/gb-2010-11-3-r25
myDGEList.filtered.norm <- calcNormFactors(myDGEList.filtered, method = "TMM")
log2.cpm.filtered.norm <- cpm(myDGEList.filtered.norm, log=TRUE)
log2.cpm.filtered.norm.df<- cpm(myDGEList.filtered.norm, log=TRUE) %>%
as_tibble(rownames = "geneID") %>%
setNames(nm = c("geneID", targets$sample))
log2.cpm.filtered.norm.df.pivot<-log2.cpm.filtered.norm.df %>%
pivot_longer(cols = -geneID,
names_to = "samples",
values_to = "expression") %>%
add_column(life_stage = ids.filtered)
p3 <- ggplot(log2.cpm.filtered.norm.df.pivot) +
aes(x=samples, y=expression, fill=life_stage) +
geom_violin(trim = FALSE, show.legend = T, alpha = 0.7) +
stat_summary(fun = "median",
geom = "point",
shape = 20,
size = 2,
color = "black",
show.legend = FALSE) +
labs(y="log2 expression", x = "sample",
title = paste0("C. ", x, ": Log2 Counts per Million (CPM)"),
subtitle="filtered, TMM normalized",
caption=paste0("produced on ", Sys.time())) +
theme_bw() +
scale_fill_manual(values = paletteer_d("rcartocolor::Prism")) +
coord_flip()
output.name <- paste0(x, '_FilteringNormalizationGraphs')
save(p1, p2, p3, p.discarded, discarded.gene.df,
file = file.path(output.path,
output.name))
output.name <- paste0(x, '_DGEList_filtered_normalized')
save(myDGEList.filtered.norm,
file = file.path(output.path,
output.name))
# Compute Variance-Stabilized DGEList Object ----
# Set up the design matrix ----
# no intercept/blocking for matrix, comparisons across group
group <- factor(targets$group)
design <- model.matrix(~0 + group)
colnames(design) <- levels(group)
# NOTE: To handle a 'blocking' design or a batch effect, use:
# design <- model.matrix(~block + treatment)
# Model the mean-variance trend ----
# Use the voom function from the limma package to model the mean-variance relationship.
# voom produces a variance-stabilized expression object that includes precision
# weights for each gene to control for heteroscedasticity,
# and transforms the count data to log2-counts per million.
# Outputs: E = normalized expression values on the log2 scale
v.DGEList.filtered.norm <- voom(counts = myDGEList.filtered.norm,
design = design, plot = T)
colnames(v.DGEList.filtered.norm)<-targets$sample
colnames(v.DGEList.filtered.norm$E) <- paste(targets$group,
targets$sample,sep = '-')
# Save matrix of genes and their filtered, normalized, voom-transformed counts ----
# This is the count data that underlies the differential expression analyses in the Shiny app.
# Saving it here so that users of the app can access the input information.
output.name <- paste0(x, '_log2cpm_filtered_norm_voom.csv')
# Save in Shiny app download (www) folder
write.csv(v.DGEList.filtered.norm$E,
file = file.path(www.path,
output.name))
# Save v.DGEList ----
# Save in Shiny app Data folder
output.name <- paste0(x, '_vDGEList')
save(v.DGEList.filtered.norm,
file = file.path(app.path,
output.name))
}
load("./Outputs/elegans_FilteringNormalizationGraphs")
p1
p2
p3
p.discarded
DT::datatable(discarded.gene.df,
rownames = FALSE,
escape = FALSE,
options = list(autoWidth = TRUE,
scrollX = TRUE,
scrollY = '300px',
scrollCollapse = TRUE,
searchHighlight = TRUE,
pageLength = 10,
lengthMenu = c("5",
"10",
"25",
"50",
"100"),
initComplete = htmlwidgets::JS(
"function(settings, json) {",
paste0("$(this.api().table().container()).css({'font-size': '", "10pt", "'});"),
"}")
)) %>%
DT::formatRound(columns=c(2:(ncol(discarded.gene.df)-10)),
digits=3)
rm("p1", "p2", "p3", "p.discarded", "discarded.gene.df")
load("./Outputs/briggsae_FilteringNormalizationGraphs")
p1
p2
p3
p.discarded
DT::datatable(discarded.gene.df,
rownames = FALSE,
escape = FALSE,
options = list(autoWidth = TRUE,
scrollX = TRUE,
scrollY = '300px',
scrollCollapse = TRUE,
searchHighlight = TRUE,
pageLength = 10,
lengthMenu = c("5",
"10",
"25",
"50",
"100"),
initComplete = htmlwidgets::JS(
"function(settings, json) {",
paste0("$(this.api().table().container()).css({'font-size': '", "10pt", "'});"),
"}")
)) %>%
DT::formatRound(columns=c(2:(ncol(discarded.gene.df)-10)),
digits=3)
rm("p1", "p2", "p3", "p.discarded", "discarded.gene.df")
load("./Outputs/brenneri_FilteringNormalizationGraphs")
p1
p2
p3
p.discarded
DT::datatable(discarded.gene.df,
rownames = FALSE,
escape = FALSE,
options = list(autoWidth = TRUE,
scrollX = TRUE,
scrollY = '300px',
scrollCollapse = TRUE,
searchHighlight = TRUE,
pageLength = 10,
lengthMenu = c("5",
"10",
"25",
"50",
"100"),
initComplete = htmlwidgets::JS(
"function(settings, json) {",
paste0("$(this.api().table().container()).css({'font-size': '", "10pt", "'});"),
"}")
)) %>%
DT::formatRound(columns=c(2:(ncol(discarded.gene.df)-10)),
digits=3)
rm("p1", "p2", "p3", "p.discarded", "discarded.gene.df")
load("./Outputs/remanei_FilteringNormalizationGraphs")
p1
p2
p3
p.discarded
DT::datatable(discarded.gene.df,
rownames = FALSE,
escape = FALSE,
options = list(autoWidth = TRUE,
scrollX = TRUE,
scrollY = '300px',
scrollCollapse = TRUE,
searchHighlight = TRUE,
pageLength = 10,
lengthMenu = c("5",
"10",
"25",
"50",
"100"),
initComplete = htmlwidgets::JS(
"function(settings, json) {",
paste0("$(this.api().table().container()).css({'font-size': '", "10pt", "'});"),
"}")
)) %>%
DT::formatRound(columns=c(2:(ncol(discarded.gene.df)-10)),
digits=3)
rm("p1", "p2", "p3", "p.discarded", "discarded.gene.df")
load("./Outputs/japonica_FilteringNormalizationGraphs")
p1
p2
p3
p.discarded
DT::datatable(discarded.gene.df,
rownames = FALSE,
escape = FALSE,
options = list(autoWidth = TRUE,
scrollX = TRUE,
scrollY = '300px',
scrollCollapse = TRUE,
searchHighlight = TRUE,
pageLength = 10,
lengthMenu = c("5",
"10",
"25",
"50",
"100"),
initComplete = htmlwidgets::JS(
"function(settings, json) {",
paste0("$(this.api().table().container()).css({'font-size': '", "10pt", "'});"),
"}")
)) %>%
DT::formatRound(columns=c(2:(ncol(discarded.gene.df)-10)),
digits=3)
sessionInfo()
## R version 4.3.2 (2023-10-31)
## Platform: aarch64-apple-darwin20 (64-bit)
## Running under: macOS Ventura 13.4
##
## Matrix products: default
## BLAS: /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRblas.0.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRlapack.dylib; LAPACK version 3.11.0
##
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
##
## time zone: America/Los_Angeles
## tzcode source: internal
##
## attached base packages:
## [1] stats4 stats graphics grDevices utils datasets methods
## [8] base
##
## other attached packages:
## [1] knitr_1.45 paletteer_1.5.0 gprofiler2_0.2.2
## [4] ggthemes_5.0.0 cowplot_1.1.1 matrixStats_1.1.0
## [7] edgeR_4.0.2 limma_3.58.1 magrittr_2.0.3
## [10] data.table_1.14.8 lubridate_1.9.3 forcats_1.0.0
## [13] stringr_1.5.1 dplyr_1.1.4 purrr_1.0.2
## [16] readr_2.1.4 tidyr_1.3.0 tibble_3.2.1
## [19] ggplot2_3.4.4 tidyverse_2.0.0 pacman_0.5.1
## [22] biomaRt_2.58.0 ensembldb_2.26.0 AnnotationFilter_1.26.0
## [25] GenomicFeatures_1.54.1 AnnotationDbi_1.64.1 Biobase_2.62.0
## [28] GenomicRanges_1.54.1 GenomeInfoDb_1.38.1 IRanges_2.36.0
## [31] S4Vectors_0.40.2 BiocGenerics_0.48.1 tximport_1.30.0
## [34] BiocManager_1.30.22
##
## loaded via a namespace (and not attached):
## [1] rstudioapi_0.15.0 jsonlite_1.8.7
## [3] farver_2.1.1 rmarkdown_2.25
## [5] BiocIO_1.12.0 zlibbioc_1.48.0
## [7] vctrs_0.6.4 memoise_2.0.1
## [9] Rsamtools_2.18.0 RCurl_1.98-1.13
## [11] htmltools_0.5.7 S4Arrays_1.2.0
## [13] progress_1.2.2 curl_5.1.0
## [15] SparseArray_1.2.2 sass_0.4.7
## [17] bslib_0.6.1 htmlwidgets_1.6.3
## [19] plotly_4.10.3 cachem_1.0.8
## [21] GenomicAlignments_1.38.0 lifecycle_1.0.4
## [23] pkgconfig_2.0.3 Matrix_1.6-4
## [25] R6_2.5.1 fastmap_1.1.1
## [27] GenomeInfoDbData_1.2.11 MatrixGenerics_1.14.0
## [29] digest_0.6.33 colorspace_2.1-1
## [31] rematch2_2.1.2 prismatic_1.1.1
## [33] crosstalk_1.2.1 RSQLite_2.3.3
## [35] labeling_0.4.3 filelock_1.0.2
## [37] fansi_1.0.5 timechange_0.2.0
## [39] httr_1.4.7 abind_1.4-7
## [41] compiler_4.3.2 bit64_4.0.5
## [43] withr_2.5.2 BiocParallel_1.36.0
## [45] DBI_1.1.3 highr_0.10
## [47] rappdirs_0.3.3 DelayedArray_0.28.0
## [49] rjson_0.2.21 tools_4.3.2
## [51] glue_1.6.2 restfulr_0.0.15
## [53] grid_4.3.2 generics_0.1.3
## [55] gtable_0.3.4 tzdb_0.4.0
## [57] hms_1.1.3 xml2_1.3.5
## [59] utf8_1.2.4 XVector_0.42.0
## [61] pillar_1.9.0 vroom_1.6.4
## [63] BiocFileCache_2.10.1 lattice_0.22-5
## [65] rtracklayer_1.62.0 bit_4.0.5
## [67] tidyselect_1.2.0 locfit_1.5-9.8
## [69] Biostrings_2.70.1 ProtGenerics_1.34.0
## [71] SummarizedExperiment_1.32.0 xfun_0.41
## [73] statmod_1.5.0 DT_0.30
## [75] stringi_1.8.2 lazyeval_0.2.2
## [77] yaml_2.3.7 evaluate_0.23
## [79] codetools_0.2-19 cli_3.6.1
## [81] munsell_0.5.0 jquerylib_0.1.4
## [83] Rcpp_1.0.11 dbplyr_2.4.0
## [85] png_0.1-8 XML_3.99-0.16
## [87] parallel_4.3.2 ellipsis_0.3.2
## [89] blob_1.2.4 prettyunits_1.2.0
## [91] bitops_1.0-7 viridisLite_0.4.2
## [93] scales_1.3.0 crayon_1.5.2
## [95] rlang_1.1.2 KEGGREST_1.42.0